by $R_4 R_3 R_2 R_1 R_1'$. The cleavage happened between $R_1$ and $R_1'$.
each peptide of length d was denoted by a vector $\mathbf{x} \in \mathcal{C}^d$, where
$\mathcal{C}$ was the set of the amino acids. In most protease cleavage data analysis
all peptides had the same length. A set of peptides was denoted
Ω⋃Ω. Ω was used to denote a set of the non-cleaved peptides
was used to denote a set of the cleaved peptides.
Before starting the description of how to use GP to model factor Xa
cleavage data, the first thing was how to quantify the similarity
between residues, which were the amino acids, from two peptides. The
bio-basis function [Thomson and Yang, 2002; Thomson et al.,
was used to measure how a rule fitted a data point (an amino acid).
the mth residue of a peptide $\mathbf{x}$ was denoted by $x_m$ and the mth
residue defined in a GP rule $\mathbf{r}$ was denoted by $r_m$. The fitness was defined
as follows, where $\pi(\cdot)$ was the mutation probability between two amino
acids based on a mutation matrix,
$$\pi(x_m, r_m) \tag{8.11}$$
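As a minimal sketch of Eq. (8.11), the mutation probability can be read from a lookup table indexed by a pair of amino acids. The three-letter alphabet, the numeric values, and the names `TOY_MUTATION` and `pi` are illustrative assumptions; they are not taken from any published mutation matrix.

```python
# Hypothetical toy table: the values are made up for illustration only
# and each unordered amino-acid pair is stored once.
TOY_MUTATION = {
    ("Y", "Y"): 1.0, ("S", "S"): 1.0, ("G", "G"): 1.0,
    ("Y", "S"): 0.2, ("Y", "G"): 0.1, ("S", "G"): 0.4,
}


def pi(x_m: str, r_m: str) -> float:
    """Return the toy mutation value between peptide residue x_m and rule residue r_m."""
    if (x_m, r_m) in TOY_MUTATION:
        return TOY_MUTATION[(x_m, r_m)]
    # Fall back to the swapped key because the table is treated as symmetric.
    return TOY_MUTATION[(r_m, x_m)]
```

In practice the table would be replaced by whichever mutation matrix the analysis uses; only the lookup pattern is shown here.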
Suppose a number of residues between a peptide and a rule were
used in decision-making. The min-max function was developed for
this kind of analysis [Yang et al., 2003]. The min function was defined as
follows, where $\mathbf{x}$ was a peptide and $\mathbf{r}$ was a GP rule,
$$\psi^{+}(\mathbf{x}, \mathbf{r}) = \min_{m} \{\pi(x_m, r_m)\} \tag{8.12}$$
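The min function in Eq. (8.12) can be sketched as follows, assuming a rule is stored as a mapping from 0-based residue positions to the amino acids it specifies. The names `psi_min` and `toy_pi`, the example peptides, and the toy similarity values are hypothetical and stand in for a real mutation matrix.

```python
from typing import Callable, Dict


def psi_min(peptide: str, rule: Dict[int, str],
            pi: Callable[[str, str], float]) -> float:
    """Smallest pi(x_m, r_m) over the residue positions m used by the rule."""
    if not rule:
        raise ValueError("rule must specify at least one residue")
    return min(pi(peptide[m], amino_acid) for m, amino_acid in rule.items())


def toy_pi(a: str, b: str) -> float:
    """Hypothetical stand-in for a mutation matrix lookup (made-up values)."""
    return 1.0 if a == b else 0.1


# A rule that specifies Y at the first residue and S at the fourth residue.
rule = {0: "Y", 3: "S"}
full_match = psi_min("YGGS", rule, toy_pi)     # both specified residues match
partial_match = psi_min("YGGA", rule, toy_pi)  # the fourth residue differs
```

Because the minimum is taken, a single poorly matching residue is enough to pull the whole score down, which is the behaviour the min function is meant to capture.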
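To express the min function, an RPN chromosome was expressed as
follows, where $r_m$ was the mth residue of the GP rule $\mathbf{r}$ and $c_m$ was the
amino acid used by $r_m$,
$$\psi^{+}(\mathbf{x}, \mathbf{r}) = \left\{ \prod_{m} (r_m c_m) \right\} \tag{8.13}$$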
Using the RPN chromosome to represent a rule, the residue indexes in
the rule were encoded using letters, such as a, b, c, etc. For instance,
an example of the min function was (aYdS)+. In this example, a and d were
the first and the fourth residues of a peptide while Y and S were the two amino
acids used in this rule for the two residues, respectively. The fitness of this